19 research outputs found

    Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

    The development of a satisfying and rigorous mathematical understanding of the performance of neural networks is a major challenge in artificial intelligence. Against this background, we study the expressive power of neural networks through the example of the classical NP-hard Knapsack Problem. Our main contribution is a class of recurrent neural networks (RNNs) with rectified linear units that are iteratively applied to each item of a Knapsack instance and thereby compute optimal or provably good solution values. We show that an RNN of depth four and width depending quadratically on the profit of an optimum Knapsack solution is sufficient to find optimum Knapsack solutions. We also prove the following tradeoff between the size of an RNN and the quality of the computed Knapsack solution: for Knapsack instances consisting of n items, an RNN of depth five and width w computes a solution of value at least 1 − O(n²/√w) times the optimum solution value. Our results build upon a classical dynamic programming formulation of the Knapsack Problem as well as a careful rounding of profit values, which is also at the core of the well-known fully polynomial-time approximation scheme for the Knapsack Problem. A carefully conducted computational study qualitatively supports our theoretical size bounds. Finally, we point out that our results can be generalized to many other combinatorial optimization problems that admit dynamic programming solution methods, such as various Shortest Path Problems, the Longest Common Subsequence Problem, and the Traveling Salesperson Problem.
    Comment: A short version of this paper appears in the proceedings of AAAI 202
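
    The profit-indexed dynamic program and the profit rounding mentioned above can be sketched as follows. This is the textbook DP/FPTAS pair, not the paper's RNN construction, and all function names are illustrative.

```python
# Hedged sketch: the classical profit-indexed Knapsack DP and the profit
# rounding behind the FPTAS.  Illustrative names; not the paper's RNN
# construction.  Profits are assumed to be positive integers.

def knapsack_by_profit(weights, profits, capacity):
    """Return the optimum profit via dp[p] = minimum weight achieving profit p."""
    P = sum(profits)
    INF = float("inf")
    dp = [0.0] + [INF] * P
    for w, p_item in zip(weights, profits):
        for p in range(P, p_item - 1, -1):  # iterate downwards: each item used once
            dp[p] = min(dp[p], dp[p - p_item] + w)
    return max(p for p in range(P + 1) if dp[p] <= capacity)

def fptas_value(weights, profits, capacity, eps):
    """Rounded variant: scaling profits by eps*pmax/n keeps the DP table of
    size O(n^2/eps) while the returned estimate stays within a (1 - eps)
    factor of the optimum (assuming every single item fits)."""
    n, pmax = len(profits), max(profits)
    scale = eps * pmax / n
    scaled = [int(p / scale) for p in profits]
    return knapsack_by_profit(weights, scaled, capacity) * scale
```

    The downward-rounded profits underestimate the true values, so the scaled optimum times the scale factor is a lower bound that loses at most eps·pmax in total.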

    The computational complexity of ReLU network training parameterized by data dimensionality

    Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension d of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter d and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including the ℓp-loss for all p ∈ [0, ∞]. In particular, we improve a known polynomial-time algorithm for constant d and convex loss functions to a more general class of loss functions, matching our running time lower bounds also in these cases.

    Coloring Drawings of Graphs

    We consider face-colorings of drawings of graphs in the plane. Given a multigraph G together with a drawing Γ(G) in the plane with only finitely many crossings, we define a face-k-coloring of Γ(G) to be a coloring of the maximal connected regions of the drawing, the faces, with k colors such that adjacent faces have different colors. By the 4-color theorem, every drawing of a bridgeless graph has a face-4-coloring. A drawing of a graph is facially 2-colorable if and only if the underlying graph is Eulerian. We show that every graph without degree-1 vertices admits a 3-colorable drawing. This leads to the natural question of which graphs G have the property that each of their drawings has a 3-coloring. We say that such a graph G is facially 3-colorable. We derive several sufficient and necessary conditions for this property: we show that every 4-edge-connected graph and every graph admitting a nowhere-zero 3-flow is facially 3-colorable. We also discuss circumstances under which facial 3-colorability guarantees the existence of a nowhere-zero 3-flow. On the negative side, we present an infinite family of facially 3-colorable graphs without a nowhere-zero 3-flow. On the positive side, we formulate a conjecture which has a surprising relation to a famous open problem by Tutte known as the 3-flow conjecture. We prove our conjecture for subcubic and for K3,3-minor-free graphs.
    Comment: 24 pages, 17 figures
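
    The Eulerian criterion for facial 2-colorability quoted above is straightforward to check. A minimal sketch with illustrative names; it tests only the even-degree part of the Eulerian condition and assumes a connected drawing.

```python
# Hedged sketch: a drawing is face-2-colorable iff the underlying multigraph
# is Eulerian; this checks the even-degree condition (connectivity assumed).
from collections import Counter

def face_two_colorable(edges):
    """edges: list of (u, v) pairs; loops and parallel edges are allowed."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1  # a loop (u, u) correctly contributes 2 to deg(u)
    return all(d % 2 == 0 for d in degree.values())
```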

    Towards Lower Bounds on the Depth of ReLU Neural Networks

    We contribute to a better understanding of the class of functions that is represented by a neural network with ReLU activations and a given architecture. Using techniques from mixed-integer optimization, polyhedral theory, and tropical geometry, we provide a mathematical counterbalance to the universal approximation theorems, which suggest that a single hidden layer is sufficient for learning tasks. In particular, we investigate whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). This problem has potential impact on algorithmic and statistical aspects because of the insight it provides into the class of functions represented by neural hypothesis classes. However, to the best of our knowledge, this question has not been investigated in the neural network literature. We also present upper bounds on the sizes of neural networks required to represent functions in these neural hypothesis classes.
    Comment: Camera-ready version for the NeurIPS 2021 conference
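
    As a concrete instance of exact representability, the maximum of two numbers is computable by a ReLU network with a single hidden layer; whether deeper networks are needed for the maximum of more numbers is the flavor of question studied here. A minimal illustrative sketch, not taken from the paper:

```python
# Hedged illustration: max(x, y) = (x + y + |x - y|) / 2 is exactly
# representable with one hidden ReLU layer, using |t| = relu(t) + relu(-t)
# and t = relu(t) - relu(-t).

def relu(t):
    return t if t > 0 else 0.0

def max_one_hidden_layer(x, y):
    # hidden layer: four ReLU units, each an affine function of (x, y)
    h1, h2 = relu(x - y), relu(y - x)      # together: |x - y|
    h3, h4 = relu(x + y), relu(-(x + y))   # together: x + y
    # output layer: affine combination of the hidden units
    return 0.5 * (h1 + h2 + h3 - h4)
```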

    Training Fully Connected Neural Networks is ∃ℝ-Complete

    We consider the algorithmic problem of finding the optimal weights and biases for a two-layer fully connected neural network to fit a given set of data points, known as empirical risk minimization in the machine learning community. We show that the problem is ∃ℝ-complete. This complexity class can be defined as the set of algorithmic problems that are polynomial-time equivalent to finding real roots of a polynomial with integer coefficients. Our results hold even if all of the following restrictions are added simultaneously:
    • there are exactly two output neurons;
    • there are exactly two input neurons;
    • the data has only 13 different labels;
    • the number of hidden neurons is a constant fraction of the number of data points;
    • the target training error is zero;
    • the ReLU activation function is used.
    This shows that even very simple networks are difficult to train. The result offers an explanation (though far from a complete understanding) of why only gradient descent is widely successful in training neural networks in practice. We generalize a recent result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021]. This result falls into a recent line of research that tries to unveil that a series of central algorithmic problems from widely different areas of computer science and mathematics are ∃ℝ-complete: this includes the art gallery problem [JACM/STOC 2018], geometric packing [FOCS 2020], covering polygons with convex polygons [FOCS 2021], and continuous constraint satisfaction problems [FOCS 2021].
    Comment: 38 pages, 18 figures

    ReLU Neural Networks of Polynomial Size for Exact Maximum Flow Computation

    This paper studies the expressive power of artificial neural networks (NNs) with rectified linear units. To study them as a model of real-valued computation, we introduce the concept of Max-Affine Arithmetic Programs and show equivalence between them and NNs concerning natural complexity measures. We then use this result to show that two fundamental combinatorial optimization problems can be solved with polynomial-size NNs, which is equivalent to the existence of very special strongly polynomial-time algorithms. First, we show that for any undirected graph with n nodes, there is an NN of size O(n³) that takes the edge weights as input and computes the value of a minimum spanning tree of the graph. Second, we show that for any directed graph with n nodes and m arcs, there is an NN of size O(m²n²) that takes the arc capacities as input and computes a maximum flow. These results imply in particular that the solutions of the corresponding parametric optimization problems, where all edge weights or arc capacities are free parameters, can be encoded in polynomial space and evaluated in polynomial time, and that such an encoding is provided by an NN.
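
    The function such an NN encodes for spanning trees is the map from edge weights to optimum value, i.e. the same function a plain Kruskal run evaluates. A minimal illustrative sketch of that map (unrelated to the NN construction itself):

```python
# Hedged sketch: the weight-vector-to-MST-value map, evaluated by Kruskal's
# algorithm with a union-find structure.  Illustrative names only.

def mst_value(n, edges, weights):
    """edges: list of (u, v) pairs on vertices 0..n-1; weights[i] belongs to edges[i]."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0.0
    for w, (u, v) in sorted(zip(weights, edges)):  # scan edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:  # edge joins two components: take it
            parent[ru] = rv
            total += w
    return total
```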

    A Parametric View on Robust Graph Problems

    For some optimization problems on a graph G = (V, E), one can give a general formulation: given a cost function c: E → ℝ≥0 on the edges and a set X ⊆ 2^E of (so-called feasible) subsets of E, one aims to minimize ∑_{e ∈ S} c(e) among all feasible S ∈ X. This formulation covers, for instance, the shortest path problem by choosing X as the set of all paths between two vertices, or the minimum spanning tree problem by choosing X to be the set of all spanning trees. This bachelor thesis deals with a parametric version of this formulation, where the edge costs c_λ: E → ℝ≥0 depend on a parameter λ ∈ ℝ≥0 in a concave and piecewise linear manner. The goal is to investigate the worst-case minimum size of a so-called representation system R ⊆ X, which contains an optimal solution S(λ) ∈ R for each scenario λ ∈ ℝ≥0. It turns out that only a pseudo-polynomial size can be ensured in general, but smaller systems must exist in special cases. Moreover, methods are presented to find such small systems algorithmically. Finally, the notion of a representation system is relaxed in order to obtain smaller (i.e., polynomial-size) systems ensuring a certain approximation ratio.
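
    Under the simplifying assumption that each feasible solution's cost is a single linear function a + b·λ (the thesis allows concave piecewise linear costs, and restricts λ ≥ 0 while this sketch considers all λ), the solutions that are optimal for some scenario are essentially the lines on the lower envelope of these cost functions, so a representation system can be read off the envelope. A minimal convex-hull-trick sketch with illustrative names:

```python
# Hedged sketch: lower envelope of lines (intercept a, slope b) representing
# costs a + b * lam.  The surviving lines form a small representation system
# in the all-linear special case.  Not taken from the thesis.

def lower_envelope(lines):
    """Return, in order of increasing slope, the lines on the lower envelope."""
    lines = sorted(lines, key=lambda ab: (ab[1], ab[0]))
    filtered = []
    for a, b in lines:
        if filtered and filtered[-1][1] == b:
            continue  # equal slopes: keep only the smaller intercept
        filtered.append((a, b))
    hull = []
    for a, b in filtered:
        while len(hull) >= 2:
            a1, b1 = hull[-2]
            a2, b2 = hull[-1]
            # middle line (a2, b2) is redundant if the two outer lines
            # intersect at or below it
            if (a2 - a1) * (b - b1) >= (a - a1) * (b2 - b1):
                hull.pop()
            else:
                break
        hull.append((a, b))
    return hull
```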

    Scheduling a Proportionate Flow Shop of Batching Machines

    Cutting-edge cancer therapy involves producing individualized medicine for many patients at the same time. Within this process, most steps can be completed for a certain number of patients simultaneously. Using these resources efficiently may significantly reduce waiting times for the patients and is therefore crucial for saving human lives. However, this involves solving a complex scheduling problem, which can mathematically be modeled as a proportionate flow shop of batching machines (PFB). In this thesis we investigate exact and approximate algorithms for tackling many variants of this problem. Related mathematical models have been studied before in the context of semiconductor manufacturing.